8 research outputs found

    An Efficient Algorithm for Bulk-Loading xBR+-trees

    A major part of the interface to a database is made up of the queries that can be addressed to this database and answered (processed) in an efficient way, contributing to the quality of the developed software. Efficiently processed spatial queries constitute a fundamental part of the interface to spatial databases due to the wide area of applications that may address such queries, like geographical information systems (GIS), location-based services, computer visualization, automated mapping, facilities management, etc. Another important capability of the interface to a spatial database is to offer the creation of efficient index structures to speed up spatial query processing. The xBR+-tree is a balanced disk-resident quadtree-based index structure for point data, which is very efficient for processing such queries. Bulk-loading refers to the process of creating an index from scratch, when the dataset to be indexed is available beforehand, instead of creating the index gradually (and more slowly), when the dataset elements are inserted one-by-one. In this paper, we present an algorithm for bulk-loading xBR+-trees for big datasets residing on disk, using a limited amount of main memory. The resulting tree is not only built fast, but exhibits high performance in processing a broad range of spatial queries, where one or two datasets are involved. To justify these characteristics, using real and artificial datasets of various cardinalities, first, we present an experimental comparison of this algorithm vs. a previous version of the same algorithm and STR, a popular algorithm of bulk-loading R-trees, regarding tree creation time and the characteristics of the trees created, and second, we experimentally compare the query efficiency of bulk-loaded xBR+-trees vs. bulk-loaded R-trees, regarding I/O and execution time. Thus, this paper contributes to the implementation of spatial database interfaces and the efficient storage organization for big spatial data management.

    Efficient query processing on large spatial databases: A performance study

    Processing of spatial queries has been studied extensively in the literature. In most cases, it is accomplished by indexing spatial data using spatial access methods. Spatial indexes, such as those based on the Quadtree, are important in spatial databases for efficient execution of queries involving spatial constraints and objects. In this paper, we study a recent balanced disk-based index structure for point data, called xBR+-tree, that belongs to the Quadtree family and hierarchically decomposes space in a regular manner. For the most common spatial queries, like Point Location, Window, Distance Range, Nearest Neighbor and Distance-based Join, the R-tree family is a very popular choice of spatial index, due to its excellent query performance. For this reason, we compare the performance of the xBR+-tree with respect to the R*-tree and the R+-tree for tree building and processing the most studied spatial queries. To perform this comparison, we utilize existing algorithms and present new ones. We demonstrate through extensive experimental performance results (I/O efficiency and execution time), based on medium and large real and synthetic datasets, that the xBR+-tree is a big winner in execution time in all cases and a winner in I/O in most cases.

    New Plane-Sweep Algorithms for Distance-Based Join Queries in Spatial Databases

    Efficient and effective processing of the distance-based join query (DJQ) is of great importance in spatial databases due to the wide area of applications that may address such queries (mapping, urban planning, transportation planning, resource management, etc.). The most representative and studied DJQs are the K Closest Pairs Query (KCPQ) and the ε Distance Join Query (εDJQ). These spatial queries involve two spatial data sets and a distance function to measure the degree of closeness, along with a given number of pairs in the final result (K) or a distance threshold (ε). In this paper, we propose four new plane-sweep-based algorithms for KCPQs and their extensions for εDJQs in the context of spatial databases, without the use of an index for either of the two disk-resident data sets (since building and using indexes is not always in favor of processing performance). They employ a combination of plane-sweep algorithms and space partitioning techniques to join the data sets. Finally, we present results of an extensive experimental study that compares the efficiency and effectiveness of the proposed algorithms for KCPQs and εDJQs. This performance study, conducted on medium and big spatial data sets (real and synthetic), validates that the proposed plane-sweep-based algorithms are very promising in terms of both efficiency and effectiveness measures when neither input is indexed. Moreover, the best of the new algorithms is experimentally compared to the best algorithm that is based on the R-tree (a widely accepted access method), for KCPQs and εDJQs, using the same data sets. This comparison shows that the new algorithms outperform R-tree based algorithms in most cases.
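
    To make the two query definitions in the abstract above concrete, the following is a minimal in-memory sketch of a classic plane-sweep along the x-axis. It is not one of the four algorithms proposed in the paper (those work on disk-resident data with space partitioning); the function names `k_closest_pairs` and `epsilon_distance_join` are hypothetical, and points are assumed to be 2D tuples compared with Euclidean distance.

```python
import heapq
import math
from bisect import bisect_left


def k_closest_pairs(P, Q, k):
    """KCPQ sketch: return the k closest (distance, p, q) pairs, p in P, q in Q."""
    Q_sorted = sorted(Q)                     # sweep order: ascending x
    q_xs = [q[0] for q in Q_sorted]
    heap = []                                # max-heap via negated distances
    for p in sorted(P):
        # Pruning bound: distance of the k-th best pair found so far.
        bound = -heap[0][0] if len(heap) == k else math.inf
        # Skip points of Q that lie more than `bound` to the left of p.
        start = 0 if bound == math.inf else bisect_left(q_xs, p[0] - bound)
        for q in Q_sorted[start:]:
            bound = -heap[0][0] if len(heap) == k else math.inf
            if q[0] - p[0] > bound:          # every later q is even farther in x
                break
            d = math.dist(p, q)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p, q))
            elif d < bound:
                heapq.heapreplace(heap, (-d, p, q))
    return sorted((-nd, p, q) for nd, p, q in heap)


def epsilon_distance_join(P, Q, eps):
    """eDJQ sketch: return all pairs (p, q) with dist(p, q) <= eps."""
    Q_sorted = sorted(Q)
    q_xs = [q[0] for q in Q_sorted]
    result = []
    for p in P:
        start = bisect_left(q_xs, p[0] - eps)
        for q in Q_sorted[start:]:
            if q[0] - p[0] > eps:            # outside the sweep strip
                break
            if math.dist(p, q) <= eps:
                result.append((p, q))
    return result
```

    The only difference between the two queries in this sketch is the pruning bound: KCPQ tightens it as better pairs are found, while εDJQ keeps it fixed at ε.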

    Structuring point data and processing spatial queries

    The aim of the present thesis, in the field of structuring very large volumes of point data, was to develop and study an improved version of the xBR-tree index structure, named xBR+-tree. This index had to be capable of organizing, storing and querying both small and big spatial data. Construction methods (one-by-one insertion and bulk loading) and a deletion algorithm were developed. The xBR+-tree was compared experimentally with the xBR-tree and popular R-trees in tree building and in processing spatial queries with one or two input data sets. We proposed two enhancements to the classic plane-sweep algorithms for join queries (kCPQ, εDJQ) over two input spatial data sets stored in main memory. One new algorithm (Reverse Run Plane Sweep, RRPS) was developed to improve the processing of these queries, both when the data reside entirely in main memory and when parts of the data sets are loaded into main memory selectively. The experimental results showed that RRPS always reduces the number of distance calculations and therefore shortens execution time. In the field of spatial query processing on data stored in main memory, existing methods were studied and new ones were proposed to solve the k Group Nearest Neighbor problem. Finally, new algorithms that combine the plane-sweep technique with space partitioning were proposed and studied for kCPQ and εDJQ over data sets stored in secondary memory, without using any spatial index. The RRPS algorithms proved more efficient for these queries than those based on the classic plane-sweep technique. The best of the new algorithms also proved more efficient in an experimental comparison with the best algorithm that uses the R-tree spatial index.